Relationships in Structured Text Retrieval

Author

  • Mounia Lalmas
Abstract

SYNONYM None

DEFINITION In structured text retrieval, the relationships between text components may be used when ranking components with respect to a given query.

MAIN TEXT In a structured text document, the document components are related to one another. In the context of XML retrieval, the relationships between elements are provided by the logical structure of the XML markup. An element, unless it is the root element (the document itself), has a parent element, which may itself have a parent element. Similarly, non-leaf elements have child elements, which may in turn have children of their own. Considering relationships between elements appears to be beneficial for XML retrieval. For instance, in a collection of scientific articles, it is reasonable to assume that the "abstract" of an article is a better indicator of what the article is about than a "future work" section in the same article. The challenge in XML retrieval is to determine which types of relationship should be considered, and how this information can be used to score elements according to how relevant they are to a given query. In the contextualization approach [1], considering the "root element – element" relationship when ranking an element, in addition to the element's own content, has been shown to improve retrieval effectiveness [2,3].
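The idea of scoring an element using its own content together with the root element can be illustrated with a small sketch. This is not the method of [1], only a toy under stated assumptions: the document `DOC`, the word-overlap function `term_score`, the function `contextualized_scores`, and the mixing weight `alpha` are all hypothetical names and choices introduced here for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy document; tag names and text are illustrative only.
DOC = """<article>
  <abstract>xml retrieval ranks elements by relevance</abstract>
  <section><title>future work</title><p>study ranking of images</p></section>
</article>"""


def term_score(text, query_terms):
    """Toy content score: fraction of the words in `text` that are query terms."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(w in query_terms for w in words) / len(words)


def contextualized_scores(root, query_terms, alpha=0.5):
    """Blend each element's own score with the score of the root element.

    `alpha` is an assumed mixing weight, not a value taken from the cited work.
    Results are keyed by tag name, which only works because the tags in the
    toy document are unique.
    """
    # Score of the root element (the whole document), used as context.
    root_score = term_score(" ".join(root.itertext()), query_terms)
    scores = {}
    for elem in root.iter():
        # Own score is computed over the element's text and its descendants' text.
        own = term_score(" ".join(elem.itertext()), query_terms)
        scores[elem.tag] = (1 - alpha) * own + alpha * root_score
    return scores


if __name__ == "__main__":
    root = ET.fromstring(DOC)
    ranked = sorted(
        contextualized_scores(root, {"xml", "retrieval"}).items(),
        key=lambda item: item[1],
        reverse=True,
    )
    for tag, score in ranked:
        print(f"{tag}: {score:.3f}")
```

With the query terms "xml" and "retrieval", the "abstract" element ranks above the "section" element, and every element receives some credit from the root even when its own text matches nothing. Setting `alpha=0` falls back to ranking on each element's own content alone.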

Similar resources

A New Method for Automatic Indexing and Keyword Extraction for Information Retrieval and Text Clustering

Persian words take diverse written forms, and it is impossible to cover all grammatical forms of words with a set of specific rules, so extracting keywords automatically from Persian texts is difficult and complex. This thesis attempts to use linguistic information and a thesaurus to provide keywords. Using the symbol system is structured network can be keywords, i...

Presenting Structured Text Retrieval Results

DEFINITION Presenting structured text retrieval results refers to the fact that, in structured text retrieval, results are not independent and a judgment on their relevance needs to take their presentation into account. For example, HTML/XML/SGML documents contain a range of nested sub-trees that are fully contained in their ancestor elements. As a result, structured text retrieval should make ...

Presenting Semi-Structured Text Retrieval Results

DEFINITION Presenting semi-structured text retrieval results refers to the fact that, in semi-structured text retrieval, results are not independent and a judgment on their relevance needs to take their presentation into account. For example, HTML/XML/SGML documents contain a range of nested sub-trees that are fully contained in their ancestor elements. As a result, semi-structured text retriev...

Image retrieval using the combination of text-based and content-based algorithms

Image retrieval is an important research field that has received great attention in recent decades. In this paper, we present an approach to image retrieval based on the combination of text-based and content-based features. Keywords are used as the text-based features, and color and texture are used as the content-based features. A query in this system contains some keywords and an input...

Integrating a Structured-Text Retrieval System with an Object-Oriented Database System

We describe the integration of a structured-text retrieval system (TextMachine) into an object-oriented database system (OpenODB). Our approach is a light-weight one, using the external function capability of the database system to encapsulate the text retrieval system as an external information source. Yet, we are able to provide a tight integration in the query language and processing; the us...

Publication date: 2009